Data Modeling Concepts

The Model tools give end users powerful capabilities for preparing and querying data. Several key concepts explain how this toolset works in Pyramid, regardless of which modeling interface is used.

External vs Pyramid Models

The Model tools allow users to build the semantic layer through which all users can query and analyze the data in a data source. However, if a data source includes its own semantic layer (an "external" model), Pyramid's modeling tools are not necessarily required. Pyramid offers near-identical functionality across both external and internal models, covering every aspect of the platform, including data visualizations, formulations, presentations and publications.

  • Click here for more details on using external models versus Pyramid's internal models.

Model Files vs Deployed Models

A critical concept for Model is the file structure. Data flows and semantic data models (see below) are built and saved in model definition files. These are saved in the content management system, like any other analytic content file in Pyramid. However, when a model file is run, it produces databases and models (a process known as "materialization"). These materialized artifacts are separate from the definition files and have their own independent security and management tools.

  • Click here for more on model files and deployed models

Data Flows vs Semantic Models

The Model tools offer two core functions:

  • Data Flows - which are used for blending data sources, fixing data issues and embellishing data (for example, with machine learning) in a pipeline framework.
  • Data Models - which are used to define virtual semantic data models that instruct the tools how to query the database.

While these two functions go hand-in-hand, it's possible for users to employ one without the other, or to separate them into two distinct operations.

Materialization vs Virtualization

As described above, Pyramid's semantic data models are virtual. That means they act as a light layer against any data source without first ingesting the data into Pyramid's internal engines. It also means users can create multiple semantic models on the same database without creating multiple copies of the data.
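The idea of a virtual layer can be illustrated with a minimal sketch that uses plain SQL views in Python's built-in sqlite3 module. This is only an analogy, not Pyramid's implementation or API: the `sales` table, the view names and the sample rows are all hypothetical. It shows two query definitions sitting over one table without copying any rows.

```python
import sqlite3

# Hypothetical source database with a single table (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("EU", "A", 10.0), ("EU", "B", 5.0), ("US", "A", 7.5)],
)

# Two "virtual models" expressed as views. Each view stores only a query
# definition, so no rows are copied out of the sales table.
conn.execute(
    "CREATE VIEW sales_by_region AS "
    "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
)
conn.execute(
    "CREATE VIEW sales_by_product AS "
    "SELECT product, SUM(amount) AS total FROM sales GROUP BY product"
)

by_region = dict(conn.execute("SELECT region, total FROM sales_by_region"))
by_product = dict(conn.execute("SELECT product, total FROM sales_by_product"))

# Because the views query the source directly, a new source row is
# visible through them immediately.
conn.execute("INSERT INTO sales VALUES ('US', 'B', 2.5)")
by_region_after = dict(conn.execute("SELECT region, total FROM sales_by_region"))
```

In this analogy, both views read the same underlying rows, which mirrors the point that several semantic models can share one database without duplicating its data.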

However, blending ("mashing up"), fixing or embellishing data (e.g. with ML) requires a new version of the data to be materialized. Using the data flow tools, Pyramid allows users to write data back to multiple data engines, including its own In-Memory database. Importantly, if customers prefer, they can also use any other ETL tool to create the databases.
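As a rough illustration of materialization, the sketch below blends two hypothetical source tables, cleans up a text column and writes the result into a new physical table using Python's built-in sqlite3 module. It is an analogy only and does not represent Pyramid's data flow tools; the table names, columns and rows are assumptions made for the example.

```python
import sqlite3

# Two hypothetical sources, held in one in-memory database for simplicity.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE crm_customers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE erp_orders (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO crm_customers VALUES (?, ?)",
    [(1, " acme "), (2, "Globex")],
)
conn.executemany(
    "INSERT INTO erp_orders VALUES (?, ?)",
    [(1, 100.0), (1, 50.0), (2, 20.0)],
)

# Materialize: blend the two sources, fix the name formatting, and write
# the result into a new table that physically stores its own rows.
conn.execute(
    "CREATE TABLE customer_totals AS "
    "SELECT UPPER(TRIM(c.name)) AS customer, SUM(o.amount) AS total "
    "FROM crm_customers c JOIN erp_orders o ON o.customer_id = c.id "
    "GROUP BY c.id, c.name"
)
materialized = dict(conn.execute("SELECT customer, total FROM customer_totals"))

# A later change to a source is not reflected until the table is rebuilt,
# because the materialized table is a separate copy of the data.
conn.execute("INSERT INTO erp_orders VALUES (2, 5.0)")
after_source_change = dict(
    conn.execute("SELECT customer, total FROM customer_totals")
)
```

Unlike a virtual layer, the materialized table is a separate copy of the data, so in this sketch it would need to be rebuilt to pick up later changes in the sources.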

  • Click here for more on Data Flows and Semantic Data Models